The importance of study design for detecting differentially abundant features in high-throughput experiments
Authors
Abstract
The use of high-throughput experiments, such as RNA-seq, to simultaneously identify differentially abundant entities across conditions has become widespread, but the systematic planning of such studies is currently hampered by the lack of general-purpose tools to do so. Here we demonstrate that there is substantial variability in performance across statistical tests, normalization techniques and study conditions, potentially leading to significant wastage of resources and/or missing information in the absence of careful study design. We present a broadly applicable experimental design tool called EDDA, and the first for single-cell RNA-seq, Nanostring and Metagenomic studies, that can be used to i) rationally choose from a panel of statistical tests, ii) measure expected performance for a study and iii) plan experiments to minimize mis-utilization of valuable resources. Using case studies from recent single-cell RNA-seq, Nanostring and Metagenomics studies, we highlight its general utility and, in particular, show a) the ability to correctly model single-cell RNA-seq data and do comparisons with 1/5 the amount of sequencing currently used and b) that the selection of suitable statistical tests strongly impacts the ability to detect biomarkers in Metagenomic studies. Furthermore, we demonstrate that a novel mode-based normalization employed in EDDA uniformly improves in robustness over existing approaches (10-20%) and increases precision to detect differential abundance by up to 140%.
Related resources
Genome analysis: An Informative Approach on Differential Abundance Analysis for Time-course Metagenomic Sequencing Data
Motivation: The advent of high-throughput next generation sequencing technology has greatly promoted the field of metagenomics where previously unattainable information about microbial communities can be discovered. Detecting differentially abundant features (e.g., species or genes) plays a critical role in revealing the contributors (i.e., pathogens) to the biological or medical status of micr...
Diagnosis of the disease using an ant colony gene selection method based on information gain ratio using fuzzy rough sets
With the advancement of metagenome data mining science has become focused on microarrays. Microarrays are datasets with a large number of genes that are usually irrelevant to the output class; hence, the process of gene selection or feature selection is essential. So, it follows that you can remove redundant genes and increase the speed and accuracy of classification. After applying the gene se...
Rapid and high throughput regeneration in fennel (Foeniculum vulgare Mill.) from embryo explants
Callus induction and regeneration of fennel from embryo explants were stabilized in the presence of cefotaxime antibiotic and different plant growth regulators (PGRs). The experiments were conducted under a factorial experiment, based on a completely randomized design (CRD). Genotypes; Fasa, Meshkinshar and Hajiabad were applied under different concentration of cefotaxime (0 and 100 mg l-1...
A Model for Detecting of Persian Rumors based on the Analysis of Contextual Features in the Content of Social Networks
The rumor is a collective attempt to interpret a vague but attractive situation by using the power of words. Therefore, identifying the rumor language can be helpful in identifying it. The previous research has focused more on the contextual information to reply tweets and less on the content features of the original rumor to address the rumor detection problem. Most of the studies have been in...
Nootropic Medicinal Plants; Evaluating Potent Formulation By Novelestic High throughput Pharmacological Screening (HTPS) Method
The principle of this method was to screen the pharmacological activity of six prepared polyphyto formulations by using high throughput screening method for their nootropic action. The study was performed in three stages using one, two and three animals, respectively in a group. Test formulations were given p.o daily at the dose of 50 and 100 mg/kg body weight. The test formulations were compar...
Publication date: 2013